Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence
Abstract
This paper examines the problem of multi-agent learning in N-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that a) dominated strategies become extinct (a.s.); and b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate, that is, much faster than the algorithm’s O(1/√T) worst-case regret guarantee would suggest.
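As a rough illustration of the behaviour the abstract describes, the following is a minimal sketch of two players running a hedge-style exponential weights update with noisy payoff observations. It is not taken from the paper: the score-accumulation form of the update, the t^(-1/2) step-size schedule, the Gaussian noise model, the Prisoner's Dilemma payoff table, and the names `logit_choice` and `run_hedge` are all assumptions chosen for the example.

```python
import math
import random


def logit_choice(scores):
    """Map cumulative scores to a mixed strategy via exponential weights."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def run_hedge(payoffs, rounds=2000, noise_std=1.0, seed=0):
    """Run a hedge-style update for two players on a bimatrix game.

    payoffs[i][a][b] is player i's payoff when player 0 plays action a
    and player 1 plays action b. Each player observes its expected
    payoff vector corrupted by zero-mean Gaussian noise (an assumed
    noise model, not the paper's).
    """
    rng = random.Random(seed)
    n0, n1 = len(payoffs[0]), len(payoffs[0][0])
    scores = [[0.0] * n0, [0.0] * n1]
    for t in range(1, rounds + 1):
        step = t ** -0.5  # assumed step-size schedule
        x0, x1 = logit_choice(scores[0]), logit_choice(scores[1])
        # Expected payoff of each pure action against the opponent's mix.
        v0 = [sum(payoffs[0][a][b] * x1[b] for b in range(n1)) for a in range(n0)]
        v1 = [sum(payoffs[1][a][b] * x0[a] for a in range(n0)) for b in range(n1)]
        # Both players update simultaneously from noisy observations.
        for a in range(n0):
            scores[0][a] += step * (v0[a] + rng.gauss(0.0, noise_std))
        for b in range(n1):
            scores[1][b] += step * (v1[b] + rng.gauss(0.0, noise_std))
    return [logit_choice(scores[0]), logit_choice(scores[1])]


# Hypothetical Prisoner's Dilemma: action 0 = cooperate, 1 = defect.
# Cooperating is strictly dominated; (defect, defect) is a strict equilibrium.
PRISONERS_DILEMMA = [
    [[3.0, 0.0], [5.0, 1.0]],  # player 0's payoffs
    [[3.0, 5.0], [0.0, 1.0]],  # player 1's payoffs
]

if __name__ == "__main__":
    final = run_hedge(PRISONERS_DILEMMA)
    print("player 0 strategy:", final[0])
    print("player 1 strategy:", final[1])
```

In this toy game the dominated action (cooperate) should lose essentially all of its probability mass despite the noise, and both players should concentrate on the strict equilibrium. The run only illustrates the qualitative claim and says nothing about the paper's actual rates or conditions.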
Related resources
A Regret Minimization Approach in Product Portfolio Management with respect to Customers’ Price-sensitivity
In an uncertain and competitive environment, product portfolio management (PPM) becomes more challenging for manufacturers to decide what to make and establish the most beneficial product portfolio. In this paper, a novel approach in PPM is proposed in which the environment uncertainty, competitors’ behavior and customer’s satisfaction are simultaneously considered as the most important criteri...
Exponentially fast convergence to (strict) equilibrium via hedging
Motivated by applications to data networks where fast convergence is essential, we analyze the problem of learning in generic N-person games that admit a Nash equilibrium in pure strategies. Specifically, we consider a scenario where players interact repeatedly and try to learn from past experience by small adjustments based on local – and possibly imperfect – payoff information. For concretene...
Efficient Regret Minimization in Non-Convex Games
We consider regret minimization in repeated games with non-convex loss functions. Minimizing the standard notion of regret is computationally intractable. Thus, we define a natural notion of regret which permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum. We give gradient-based methods that achieve optimal regret, which in turn guar...
Hedging the climate sensitivity risks of a temperature target
This paper addresses the problem of meeting a predetermined temperature target cost-effectively under uncertainty and gradual learning on climate sensitivity. An analytical solution to a stochastic cost-minimization problem with a temperature constraint is first provided, portraying an outline of the risk hedging solution. Then, numerical stochastic scenarios with cost curves fitted to recent c...
Congestion Games with Distance-Based Strict Uncertainty
We put forward a new model of congestion games where agents have uncertainty over the routes used by other agents. We take a non-probabilistic approach, assuming that each agent knows that the number of agents using an edge is within a certain range. Given this uncertainty, we model agents who either minimize their worst-case cost (WCC) or their worst-case regret (WCR), and study implications o...
Publication date: 2017